446 research outputs found
Study concept drift in 150-year English literature
The meaning of a concept or a word changes over time. Such concept drift also reflects changes in social consensus. Studying concept drift over time is valuable for researchers who are interested in the evolution of language or culture. Recent word embedding technologies inspire us to automatically detect concept drift in large-scale corpora. However, comparing embeddings generated from different corpora is a complex task. In this paper, we propose a simple approach for detecting concept drift based on the change in word contexts between time periods, and apply it to successive time periods so that detailed drift can be detected and visualised. We examine selected words to track how their meaning changes gradually over a long time span alongside relevant historical events, which demonstrates the effectiveness of our method.
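The abstract above does not say how the change in word contexts is measured. As a minimal sketch only, assuming contexts are counted in a fixed token window and compared by cosine distance (both are assumptions here, and `context_counts`, `cosine_distance`, and `drift_score` are hypothetical names rather than the authors' code), the idea could look like this:

```python
import math
from collections import Counter


def context_counts(tokens, target, window=2):
    """Count the words that appear within `window` tokens of `target`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def cosine_distance(a, b):
    """Cosine distance between two sparse count vectors (1.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def drift_score(tokens_early, tokens_late, target, window=2):
    """Assumed drift measure: distance between the target's contexts in two periods."""
    early = context_counts(tokens_early, target, window)
    late = context_counts(tokens_late, target, window)
    return cosine_distance(early, late)


# Toy corpora standing in for two time periods (illustrative data only).
early_period = "farmers broadcast seed across fields".split()
late_period = "stations broadcast news across networks".split()
print(drift_score(early_period, late_period, "broadcast"))
```

A score near 0 means the word keeps the same neighbours in both periods, and a score near 1 means its neighbours have been almost entirely replaced. Applying the score to each pair of consecutive periods gives a drift curve over time.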
AIGC In China: Current Developments And Future Outlook
The increasing attention given to AI Generated Content (AIGC) has brought a
profound impact on various aspects of daily life, industrial manufacturing, and
the academic sector. Recognizing the global trends and competitiveness in AIGC
development, this study aims to analyze China's current status in the field.
The investigation begins with an overview of the foundational technologies and
current applications of AIGC. Subsequently, the study delves into the market
status, policy landscape, and development trajectory of AIGC in China,
utilizing keyword searches to identify relevant scholarly papers. Furthermore,
the paper provides a comprehensive examination of AIGC products and their
corresponding ecosystem, emphasizing the ecological construction of AIGC.
Finally, this paper discusses the challenges and risks faced by the AIGC
industry while presenting a forward-looking perspective on the industry's
future based on competitive insights in AIGC.
An Experimental Study of Byzantine-Robust Aggregation Schemes in Federated Learning
Byzantine-robust federated learning aims at mitigating Byzantine failures
during the federated training process, where malicious participants may upload
arbitrary local updates to the central server to degrade the performance of the
global model. In recent years, several robust aggregation schemes have been
proposed to defend against malicious updates from Byzantine clients and improve
the robustness of federated learning. These solutions were claimed to be
Byzantine-robust under certain assumptions. Meanwhile, new attack
strategies are emerging, striving to circumvent the defense schemes. However,
there is a lack of systematic comparison and empirical study thereof. In this
paper, we conduct an experimental study of Byzantine-robust aggregation schemes
under different attacks using two popular algorithms in federated learning,
FedSGD and FedAvg. We first survey existing Byzantine attack strategies and
Byzantine-robust aggregation schemes that aim to defend against Byzantine
attacks. We also propose a new scheme, ClippedClustering, to enhance the
robustness of a clustering-based scheme by automatically clipping the updates.
Then we provide an experimental evaluation of eight aggregation schemes in the
scenario of five different Byzantine attacks. Our results show that these
aggregation schemes sustain relatively high accuracy in some cases but are
ineffective in others. In particular, our proposed ClippedClustering
successfully defends against most attacks under independent and identically
distributed (IID) local datasets. However, when the local datasets are Non-IID, the performance of all
the aggregation schemes significantly decreases. With Non-IID data, some of
these aggregation schemes fail even in the complete absence of Byzantine
clients. We conclude that the robustness of all the aggregation schemes is
limited, highlighting the need for new defense strategies, in particular for
Non-IID datasets. Comment: This paper has been accepted for publication in IEEE Transactions on
Big Dat
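To make the idea of clipping client updates before aggregation concrete, here is a minimal sketch. It is not the ClippedClustering scheme from the abstract above, whose details are not given here. It assumes each update is a flat list of floats, uses the median update norm as the clipping threshold, and aggregates by coordinate-wise median. The names `clip_update` and `robust_aggregate` are hypothetical.

```python
import math
from statistics import median


def l2_norm(update):
    """Euclidean norm of a flat update vector."""
    return math.sqrt(sum(x * x for x in update))


def clip_update(update, max_norm):
    """Scale `update` down so that its L2 norm does not exceed `max_norm`."""
    norm = l2_norm(update)
    if norm <= max_norm or norm == 0.0:
        return list(update)
    scale = max_norm / norm
    return [x * scale for x in update]


def robust_aggregate(updates):
    """Clip every update to the median norm, then take the coordinate-wise median.

    Assumption: the median norm is used as the threshold purely for illustration.
    """
    threshold = median(l2_norm(u) for u in updates)
    clipped = [clip_update(u, threshold) for u in updates]
    return [median(coords) for coords in zip(*clipped)]


# Two honest clients and one client sending an oversized update (toy data).
client_updates = [[1.0, 0.0], [0.0, 1.0], [100.0, 100.0]]
print(robust_aggregate(client_updates))
```

With plain averaging, the third client would pull the aggregate to roughly 33.7 in each coordinate. After clipping, no single update can exceed the median norm, so the oversized update has bounded influence.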
Enhancing Conversational Search: Large Language Model-Aided Informative Query Rewriting
Query rewriting plays a vital role in enhancing conversational search by
transforming context-dependent user queries into standalone forms. Existing
approaches primarily leverage human-rewritten queries as labels to train query
rewriting models. However, human rewrites may lack sufficient information for
optimal retrieval performance. To overcome this limitation, we propose
utilizing large language models (LLMs) as query rewriters, enabling the
generation of informative query rewrites through well-designed instructions. We
define four essential properties for well-formed rewrites and incorporate all
of them into the instruction. In addition, we introduce the role of rewrite
editors for LLMs when initial query rewrites are available, forming a
"rewrite-then-edit" process. Furthermore, we propose distilling the rewriting
capabilities of LLMs into smaller models to reduce rewriting latency. Our
experimental evaluation on the QReCC dataset demonstrates that informative
query rewrites can yield substantially improved retrieval performance compared
to human rewrites, especially with sparse retrievers. Comment: 22 pages, accepted to EMNLP Findings 202
High-dimensional Clustering onto Hamiltonian Cycle
Clustering aims to group unlabelled samples based on their similarities. It
has become a significant tool for the analysis of high-dimensional data.
However, most of the clustering methods merely generate pseudo labels and thus
are unable to simultaneously present the similarities between different
clusters and outliers. This paper proposes a new framework called
High-dimensional Clustering onto Hamiltonian Cycle (HCHC) to solve the above
problems. First, HCHC combines global structure with local structure in one
objective function for deep clustering, improving the labels as relative
probabilities, to mine the similarities between different clusters while
keeping the local structure in each cluster. Then, the anchors of different
clusters are sorted on the optimal Hamiltonian cycle generated by the cluster
similarities and mapped on the circumference of a circle. Finally, a sample
with a higher probability of a cluster will be mapped closer to the
corresponding anchor. In this way, our framework allows us to appreciate three
aspects visually and simultaneously - clusters (formed by samples with high
probabilities), cluster similarities (represented as circular distances), and
outliers (recognized as dots far away from all clusters). The experiments
illustrate the superiority of HCHC.
Transboundary marine spatial planning across Europe: Trends and priorities in nearly two decades of project work
As an instrument intended, amongst other things, to reduce transboundary conflicts, Transboundary Marine Spatial Planning (TMSP) has recently gained significant attention from coastal nations and regions, especially in Europe. Rather than leading to a joint marine spatial plan, TMSP is more of a continuous process of transboundary cooperation. This paper discusses the understandings of TMSP, tracks the current progress of TMSP projects in Europe and examines their underlying priorities, so as to draw lessons for the future development of TMSP. Using the project database of the European MSP Platform, European TMSP-related projects were subjected to quantitative and qualitative analysis. The main findings are: (1) there are two accelerating periods of TMSP project development (2006–2010, 2014–2016), which coincide with relevant EU policy development, with the Baltic and Mediterranean Seas accounting for more projects than other sea basins; (2) TMSP projects in different sea basins have different priorities in marine activities and cross-cutting issues, with fisheries and conservation having the largest proportions respectively; (3) most projects focus on the pre-planning stages of marine spatial planning processes, and no attention has yet been given to plan implementation in the TMSP projects.
(E)-3-(2-Bromophenyl)-1-(3,4-dimethoxyphenyl)prop-2-en-1-one
The crystal structure of the title compound, C17H15BrO3, a chalcone derivative, exhibits two crystallographically independent molecules per asymmetric unit showing an E conformation about the ethylene double bond. In each molecule, the two phenyl rings are almost coplanar: the mean planes make dihedral angles of 9.3 (2) and 19.4 (2)°. In the crystal, molecules are linked through weak intermolecular C—H⋯O hydrogen bonds.
Sleep duration in Chinese adolescents: biological, environmental, and behavioral predictors
Objective: To examine sleep duration-related risk factors from multidimensional domains among Chinese adolescents. Methods: A random sample of 4801 adolescents aged 11–20 years participated in a cross-sectional survey. A self-reported questionnaire was used to collect information about the adolescents' sleep behaviors and possible related factors from eight domains. Results: In all, 51.0% and 9.8% of adolescents did not achieve optimal sleep duration (defined as <8.0 h per day) on weekdays and on weekends, respectively. According to multivariate logistic regression models, after adjusting for all possible confounders, 17 factors were associated with sleep duration <8 h. Specifically, 13 factors from five domains were linked to physical and psychosocial condition, environment, and behaviors. These factors were overweight/obesity, chronic pain, bedtime anxiety/excitement/depression, bed/room sharing, school starting time earlier than 07:00, cram school learning, more time spent on homework on weekdays, television viewing ≥2 h/day, physical activity <1 h/day, irregular bedtime, and shorter sleep duration of father. Conclusion: Biological and psychosocial conditions, sleep environments, school schedules, daily activity and behaviors, and parents' sleep habits may significantly affect adolescents' sleep duration, indicating that the existing chronic sleep loss in adolescents could be, at least partly, addressed by improving adolescents' physical and psychosocial conditions, controlling visual screen exposure, regulating school schedules, improving sleep hygiene and daytime behaviors, and changing parents' sleep habits.
A subset of methylated CpG sites differentiate psoriatic from normal skin.
Psoriasis is a chronic inflammatory immune-mediated disorder affecting the skin and other organs including joints. Over 1,300 transcripts are altered in psoriatic involved skin compared with normal skin. However, to our knowledge, global epigenetic profiling of psoriatic skin has not previously been reported. Here, we describe a genome-wide study of altered CpG methylation in psoriatic skin. We determined the methylation levels at 27,578 CpG sites in skin samples from individuals with psoriasis (12 involved, 8 uninvolved) and 10 unaffected individuals. CpG methylation of involved skin differed from normal skin at 1,108 sites. Twelve mapped to the epidermal differentiation complex, upstream or within genes that are highly upregulated in psoriasis. Hierarchical clustering of 50 of the top differentially methylated (DM) sites separated psoriatic from normal skin samples, with uninvolved skin exhibiting intermediate methylation. CpG sites where methylation was correlated with gene expression are reported. Sites with inverse correlations between methylation and nearby gene expression include those of KYNU, OAS2, S100A12, and SERPINB3, whose strong transcriptional upregulation is an important discriminator of psoriasis. Pyrosequencing of bisulfite-treated DNA from skin biopsies at three DM loci confirmed earlier findings and revealed reversion of methylation levels toward the non-psoriatic state after 1 month of anti-TNF-α therapy.